Method for selecting a probable motion vector for real-time motion estimation in moving-picture sequences



The subject matter of the application is a method in which differences are formed from a current image section (S_f(k, l)) and from a preceding image section (S_{f-1}(k+i, l+j)). Via a staircase-shaped quantization characteristic whose step width and step height increase exponentially, in particular to base two, each difference is assigned a quantized difference (T(k, l, i, j)). For each motion vector these quantized differences are summed to a respective sum value (LPDC(i, j)), and the most probable motion vector is then determined as the one with the smallest sum value. The advantages lie in the simple realizability both as hardware and as software and in the comparatively good signal-to-noise ratio.

Description

Motion estimation in moving-picture sequences is used, for example, to reduce temporal redundancy in image coding and to interpolate intermediate images in an image sequence. For this purpose the images are divided into smaller image sections, and for each image section a motion vector relative to the preceding image is determined. The motion vector can be used to reconstruct the current image section from the information of the preceding image. As a rule, motion estimation consists of determining the two translational components of the motion vector that lie in the image plane. For all considered motions of an image section, a method for selecting the most probable motion vector must be carried out.

Such methods are known from IEEE Transactions on Circuits and Systems, Vol. 37, No. 5, May 1990, pages 649-651. The criterion for selecting the most probable motion vector is, in the so-called MSD method, the sum of the squares and, in the so-called MAD method, the sum of the absolute values of the differences between the pixel values of the current image section and the pixel values of a corresponding image section of the preceding image. In the so-called PCD method the criterion is the number of absolute differences between the pixel values of the current image section and the pixel values of a corresponding image section of the preceding image that fall below a predetermined threshold.
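The three known criteria can be illustrated with a minimal Python sketch. The function names, the block layout (a current block plus the whole preceding frame) and the sample values are assumptions made for this illustration and are not taken from the cited article. For PCD the sketch only counts the differences below the threshold; reading a larger count as a better match is an interpretation.

```python
def block_differences(cur, prev, top, left, i, j):
    """Differences between a current block and the preceding frame shifted by (i, j).

    cur is the current block (list of rows), prev is the whole preceding frame,
    and (top, left) is the block position in the frame. All names are illustrative.
    """
    return [cur[k][l] - prev[top + k + i][left + l + j]
            for k in range(len(cur))
            for l in range(len(cur[0]))]


def msd(diffs):
    """MSD criterion: sum of the squared differences."""
    return sum(d * d for d in diffs)


def mad(diffs):
    """MAD criterion: sum of the absolute differences."""
    return sum(abs(d) for d in diffs)


def pcd(diffs, threshold):
    """PCD criterion: number of absolute differences below the threshold."""
    return sum(1 for d in diffs if abs(d) < threshold)


prev_frame = [
    [0, 0, 0, 0],
    [0, 5, 9, 0],
    [0, 7, 3, 0],
    [0, 0, 0, 0],
]
cur_block = [[5, 9], [7, 4]]  # almost the 2x2 patch at row 1, column 1
diffs = block_differences(cur_block, prev_frame, top=0, left=0, i=1, j=1)
```

MSD needs one multiplication per pixel, whereas MAD and PCD need only additions and comparisons, which is the cost difference the following paragraph refers to.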

The MSD method is very costly because of the squaring, and the MAD method yields a comparatively poor signal-to-noise ratio (SNR). The PCD method is very simple to implement, but yields only very unsatisfactory results with regard to the signal-to-noise ratio.

German laid-open application DE 42 21 320 A1 discloses a motion vector detection device in which the detection accuracy is increased, without increasing the circuit complexity, by storing a plurality of sets of representative points selected at the same interval as those for search areas, each of which consists of Q*R pixels of an image of a field that precedes the current field by one or more fields.

German laid-open application DE 43 44 924 A1 discloses a method and a device for motion estimation in which only certain bits of the pixel values, for example the MSB or the two MSBs, are evaluated in order to save computing time.

German patent DE 40 23 449 C1 discloses a method for determining motion vectors for sub-image regions of a source image sequence, in which two motion vectors are first sought in an image of reduced resolution and a refined search is then carried out between the zero vector and the two vectors.

Proceedings of the IEEE, Vol. 83, No. 6, June 1995, discloses motion estimation techniques for digital television in which the bandwidth with regard to the prediction error and the motion parameters is minimized by applying locally adaptive multigrid block matching.

The object underlying the invention is to specify a method for selecting a probable motion vector for real-time motion estimation in moving-picture sequences that combines the advantages of the above-mentioned methods.

This object is achieved according to the invention by the features of claim 1. Advantageous developments of the method emerge from the dependent claims.

The invention is explained in more detail below with reference to the drawings, in which:
Fig. 1 shows a flow diagram illustrating the method according to the invention,
Fig. 2 shows a basic circuit diagram illustrating an advantageous development of the method according to the invention, and
Fig. 3 shows an example of a quantization within the method according to the invention.

Fig. 1 shows a flow diagram illustrating the method according to the invention, in which the successive method steps of difference formation D, absolute-value formation A, quantization Q, summation S and minimum formation MIN are indicated by interconnected blocks. During the difference formation D, differences d(k, l, i, j) are formed from a current image section S_f(k, l) and from a preceding image section S_{f-1}(k+i, l+j) for a plurality of pixel vectors (k, l) of the image sections and for a plurality of motion vectors (i, j); the subsequent absolute-value formation A yields the absolute values of the differences d(k, l, i, j). During the quantization Q, the absolute value of each difference is then assigned a quantized difference T(k, l, i, j) via a staircase-shaped characteristic with exponentially increasing step width, denoted SB in Fig. 3, and exponentially increasing step height, denoted SH in Fig. 3. For each motion vector (i, j), either all quantized differences themselves or all squared quantized differences are then summed, during the summation S, over all pixel vectors of the plurality of pixel vectors (k, l) to a respective sum value LPDC(i, j), a summation over k and a summation over l being carried out. Finally, during the minimum formation MIN, a most probable motion vector is determined by finding the motion vector with the smallest sum value Min(LPDC(i, j)).
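The chain D, A, Q, S, MIN can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the quantizer maps each absolute difference to the largest power of two not exceeding it (the base-2 staircase of Fig. 3), the search covers a square window of candidate vectors, and all function names, the frame layout and the sample data are hypothetical.

```python
def quantize(a):
    """Largest power of two not exceeding the non-negative integer a (0 stays 0)."""
    return 0 if a == 0 else 1 << (a.bit_length() - 1)


def lpdc(cur, prev, top, left, i, j, squared=False):
    """Sum of the (optionally squared) quantized absolute differences for vector (i, j)."""
    total = 0
    for k in range(len(cur)):
        for l in range(len(cur[0])):
            d = cur[k][l] - prev[top + k + i][left + l + j]  # difference formation D
            t = quantize(abs(d))                             # A, then Q
            total += t * t if squared else t                 # summation S
    return total


def most_probable_vector(cur, prev, top, left, search_range, squared=False):
    """Minimum formation MIN: candidate vector with the smallest sum value."""
    candidates = [(i, j)
                  for i in range(-search_range, search_range + 1)
                  for j in range(-search_range, search_range + 1)]
    return min(candidates,
               key=lambda v: lpdc(cur, prev, top, left, v[0], v[1], squared))


prev_frame = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
# Block located at (top, left) = (1, 1); its content is the preceding frame's
# patch one row further down, with slight noise added.
cur_block = [[101, 110], [140, 149]]
best = most_probable_vector(cur_block, prev_frame, top=1, left=1, search_range=1)
```

Because every quantized value is a power of two, the summation needs no multiplier even in the squared variant, which is where the hardware simplicity claimed for the method comes from.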

Fig. 3 shows an example of an advantageous quantization Q in the form of a quantization characteristic, with the difference d(k, l, i, j) on the ordinate and the quantized difference T(k, l, i, j) on the abscissa, where the step height and the step width each increase exponentially to base 2. The associated quantization characteristic can therefore be described here as follows:

$$
T(k,l,i,j) =
\begin{cases}
0 & \text{if } |S_f(k,l) - S_{f-1}(k+i,l+j)| = 0 \\
1 & \text{if } 1 \le |S_f(k,l) - S_{f-1}(k+i,l+j)| < 2 \\
2 & \text{if } 2 \le |S_f(k,l) - S_{f-1}(k+i,l+j)| < 4 \\
4 & \text{if } 4 \le |S_f(k,l) - S_{f-1}(k+i,l+j)| < 8 \\
\vdots & \\
2^{n-2} & \text{if } 2^{n-2} \le |S_f(k,l) - S_{f-1}(k+i,l+j)| < 2^{n-1} \\
2^{n-1} & \text{if } 2^{n-1} \le |S_f(k,l) - S_{f-1}(k+i,l+j)| < 2^{n}
\end{cases}
$$

The step width SB and the step height SH of the staircase-shaped quantization characteristic can in principle increase exponentially to any base. It is advantageous, however, if the step width SB of the staircase-shaped quantization characteristic increases exponentially to base 2^m and the step height SH increases exponentially to base 2^p, where m and p are positive integers greater than or equal to one and where m and p may also be equal.

For the method with the characteristic shown in Fig. 3, that is, for an exponential increase of the step height and the step width to base 2, a computationally particularly simple development of the method according to the invention is possible, which is explained in more detail below with reference to the basic circuit shown in Fig. 2. This basic circuit has a register REG with the bit positions R1 ... R5 and a sign position VZ, an accumulator register ACC with the bit positions A1 ... A5 and a sign position VZ, a unit NV for bitwise comparison with zero, and selector switches S1 and S2. The register REG receives the difference d(k, l, i, j) and the accumulator register ACC receives the quantized difference T(k, l, i, j). The quantization is produced here by masking the binary representation of the respective difference d(k, l, i, j). Starting from the most significant bit R5, the binary representation of the respective difference is read out bit by bit until the unit NV establishes that the bit just read is different from zero. The absolute-value formation is achieved very simply by ignoring the sign bit VZ of the register REG. If the bit just read out via the selector switch S1 is non-zero, it is added via the switch S2 at a corresponding bit position in the accumulator register.
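The bit-by-bit read-out can be mimicked in software as follows. This is a sketch, not the circuit of Fig. 2: it assumes a sign-magnitude representation (so that ignoring the sign yields the absolute value), a five-bit magnitude as in the register REG, and 0-based bit indices. The function name and the range check are additions for the illustration.

```python
def quantize_by_msb_scan(d, bits=5):
    """Weight of the highest set magnitude bit of d, or 0 if no bit is set.

    The sign is ignored, mirroring how the sign position VZ is skipped.
    Bit index 0 is the least significant bit; `bits` is the magnitude width.
    """
    magnitude = abs(d)
    if magnitude >= 1 << bits:
        raise ValueError("difference does not fit into the register")
    for position in range(bits - 1, -1, -1):   # start at the most significant bit
        if (magnitude >> position) & 1:        # zero comparison on the bit just read
            return 1 << position               # keep this bit, mask all lower bits
    return 0
```

For example, a difference of -6 (magnitude 110 in binary) is reduced to 4, which is the staircase value for the interval from 4 up to 8.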

If, for a respective motion vector (i, j), all quantized differences are summed over all pixel vectors of the plurality of pixel vectors (k, l) to a respective sum value (LPDC(i, j)), the non-zero bit is added at the position of matching significance. This means, for example, that the non-zero bit at position R2 in the register is added at position A2 in the accumulator register ACC.

Correspondingly, for a respective motion vector (i, j), all squared quantized differences can also be summed over all pixel vectors of the plurality of pixel vectors (k, l) to a respective sum value (LPDC(i, j)). To do so, the non-zero bit is added into an accumulator at a position raised by the number of positions corresponding to its significance, which doubles the respective significance. In the above example this means that the addition takes place not at position A2 but at position A4 of the accumulator register ACC.
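Both accumulation variants can be sketched with shifts only. The sketch below uses 0-based bit exponents, so a quantized value 2^b contributes 2^b in the plain variant and 2^(2b) in the squared variant; how this maps onto the register labels R1 ... R5 and A1 ... A5 of Fig. 2 is not assumed here. Function names and sample values are hypothetical.

```python
def highest_set_bit(magnitude):
    """0-based index of the most significant set bit, or None for zero."""
    return magnitude.bit_length() - 1 if magnitude else None


def accumulate(differences, squared=False):
    """Add one bit per difference into an accumulator, without any multiplication."""
    acc = 0
    for d in differences:
        b = highest_set_bit(abs(d))
        if b is None:
            continue                            # a zero difference adds nothing
        acc += 1 << (2 * b if squared else b)   # doubled exponent gives the square
    return acc


plain_sum = accumulate([3, -6, 0, 9])                  # 2 + 4 + 0 + 8
squared_sum = accumulate([3, -6, 0, 9], squared=True)  # 4 + 16 + 0 + 64
```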

Claims

1. Method for selecting a probable motion vector for real-time motion estimation in moving-picture sequences,
in which differences are formed from a current image section (S_f(k, l)) and from a preceding image section (S_{f-1}(k+i, l+j)) for a plurality of pixel vectors (k, l) of the image sections and for a plurality of motion vectors (i, j),
in which the absolute value of each difference is assigned a quantized difference (T(k, l, i, j)) via a staircase-shaped quantization characteristic with exponentially increasing step width and exponentially increasing step height,
in which, for a respective motion vector (i, j), all quantized differences themselves or, respectively, all squared quantized differences are summed over all pixel vectors of the plurality of pixel vectors (k, l) to a respective sum value (LPDC(i, j)), and
in which the most probable motion vector is determined by finding the motion vector with the smallest sum value.

2. Method for selecting a probable motion vector according to claim 1, in which the step width (SB) of the staircase-shaped quantization characteristic increases exponentially to base 2^m and the step height (SH) of the staircase-shaped quantization characteristic increases exponentially to base 2^p, where m and p are positive integers greater than or equal to one.

3. Method for selecting a probable motion vector according to claim 2, in which the step width (SB) and the step height (SH) increase exponentially to the same base.

4. Method for selecting a probable motion vector according to claim 3, in which the step width (SB) and the step height (SH) increase exponentially to base 2.

5. Method for selecting a probable motion vector according to claim 4,
in which the quantization is produced by masking the binary representation of the respective difference, the binary representation of the respective difference being read out bit by bit, starting from the most significant bit, until the bit just read is different from zero, and
in which, for a respective motion vector (i, j), all quantized differences themselves or, respectively, all squared quantized differences are summed over all pixel vectors of the plurality of pixel vectors (k, l) to a respective sum value (LPDC(i, j)) by adding the non-zero bit into an accumulator at the position of matching significance or, respectively, at a position raised by the number of positions corresponding to its significance.
Method and means for detecting people in image sequences.

(57) The head in a series of video images is identified by digitizing sequential images, subtracting a previous image from an input image to determine moving objects, calculating boundary curvature extremes of regions in the subtracted image, comparing the extremes with a stored model of a human head to find regions shaped like a human head, and identifying the head with a surrounding shape.

BACKGROUND OF THE INVENTION

This invention relates to methods and means for detecting people in image sequences, and particularly for locating people in video images.

Locating peop.e in video images ^^^^U compression and other applicafons. 
chines, automatic security appl.cat.ons. "^^£^£ lma J» by subtracting corresponding .mage . 

U.S. Patent No. 5.086,480 attempts to .denhfy people ,n vtfeo g y ^ thfesho|d tQ e( 

e.ements of subsequent images. ^^^^J^^S^L minimum rectangle which wili conta.n 
noise filtering and clustering the resulting data Mts - d ""™T"L and generating a head code book from 
Tsets. generating a border of ^^^SSSX^ * *° SetS ° f 

the elements in the original images that ^^^^ that there is a moving head within any .mage 

m ^sr=t=^ 

Another object of the invention to to overcome tne aior^.^ 

SUMMARY OF THE INVENTION

the invention will become evidentf rom the following aew 
drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS

to an embodiment of the invention. 

Fig. 4 is a picture of the dilated image of Fig. 3.

Fig. 5 is a block diagram showing details of a step in Fig. 2.

to the invention. rMU ltina from processing of the dilated images in Figs. 11 to 

Figs. 11Ato 22A are examples of «^ J^^JS.n, hea d portions of the contours, and show- 
22 according to the invention with annuluses drawn over the possmie n 
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

atleastone person. The processor ^Zl^^n^T^ 0 ! determines Aether a sceneincludes 
camera VC1 , to the display DI1 , SKtS? 1 * ^ ^ direCt either to the vide ° 
VC1, the control signals point £^ZZ££S?L Si9nals to *° «"« 

desired location, preferably the center Wn^^ected o me d "!, S ° ZfJt** the SCene in a 

electronically at a desired ^ V °" ' the COntro1 si9nals P |ace the P^on 

-order.may record^ 

the person in the scene, f rem 7e 1"^!^ 1 T* ^ ° UtpUt ' Which CBBta » * P-'*™ 
P'a/ the unprocessed images from^^ 

or otherwise located from the processor PRl»Z!?h. processed images, with a person centered 

*Jr*di d ^ 

video input from the camera VC™ * the processor p RL in step 101 receives the 

of two successive images from the other aJutZ^^l '"^ a " d ln Step 110 sul *racts one 

from corresponding picture efeLts^^ 

mediately successive frames (or imaaes) hut «Z J, kJ^!T preferably subtracts elements of in> 

ess. In step 114 the processor Pr^o™ ah^f? 0backwa " ls a ™"ber of frames for the subtraction proc 
step 117, compares S^TSSXSZ StS^I^ * th6 SUb,raCted ima ^- a "« ■" 

threshold to 1 and values less tha , oTeTua) to th^hrLVhn^ " PR1 S6tS ValueS 9reater tha " tne 
purpose of formation of abso,ute"^Txeo^Z£ Pr0dUCe 8 Wnary motion imaoe - The 

step 117 is to remove temporal noise P the """O*™" "Potion with a threshold value in 

v^g'^ 

-abs.ute value. endtnreshJd^^^^^ 

$$
d(x,y,t) =
\begin{cases}
1 & \text{if } |f(x,y,t) - f(x,y,t-\tau)| > T \\
0 & \text{otherwise}
\end{cases}
\qquad (1)
$$
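Equation (1) amounts to a per-pixel comparison of two frames. A minimal Python sketch follows, in which frames are plain lists of rows holding grey values; the function name and the sample frames are assumptions made for the illustration.

```python
def binary_motion_image(frame_t, frame_t_minus_tau, threshold):
    """1 where the absolute frame difference exceeds the threshold T, else 0."""
    return [[1 if abs(a - b) > threshold else 0
             for a, b in zip(row_t, row_prev)]
            for row_t, row_prev in zip(frame_t, frame_t_minus_tau)]


current = [[10, 50, 30], [90, 20, 20]]
earlier = [[12, 20, 30], [60, 33, 34]]
motion = binary_motion_image(current, earlier, threshold=13)
```

Note that a difference exactly equal to the threshold (here 13) is mapped to 0, because equation (1) uses a strict inequality.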



baCkWartS *• ProcessorPRI seiects for 

*e processor PRIincludesI^^ 

and allow a viewer to improve det^uon of a .o» ™ T F ° 10 Chan9B the T l ° hi9her inte 9 ral values 

According to one J™«*TtZ^X™*2T^ ™* ^ ^ F ° ™ y " e automatic " 
remove all the noise. However J^XS^^- L represents tne Ulresn< "1 value is selected to 
signal as well. According ™£££££Z^ ?T a D Si9nificant P° rti °" " valid difference 

generates an image with s^ randon^ uses a lower threshold value T which 

noiseunti,a,a,er 9 ;^ 

of the threshold T to eliminate ™- ! t embod,ment - a user controls input which sets the value 

mentation. 

g notse which can affect the accuracy of the object/background seg- 

impe^obj^^ 

the object OB in H^Z^t^^,^* T* 330 ' PR1 ™* ™* <"**nto the head HE on 
the processor PR Tin step ™££^^£?'T* m <*«*"**>• T° improve segmentation, 
shown ,n the dilated object DoT£ T^m^^ 9 *, "" ^ 93PS ,he b '" arv motlon lmaae as 
embodiment of the invention. m2£KtaE£-ta ^ k h , " e reSO ' Uti ° n 3nd acCuracv - to «n 

Morphological closing, which is dilation followed by erosion, fills the gaps. However, to save processing time, a preferred embodiment of the invention omits the erosion step and accepts the fact that the object size will increase slightly as a result of dilation.

Fig. 4 illustrates the dilated figure and how it improves object segmentation. A hand HA appears in both 
Fig. 3 and Fig. 4. 

In step 127 the processor PR1 identifies the boundaries of the difference image by tracing the contours of the outline of the dilated difference image shown in Fig. 4. The article by O. Johnson, J. Segen, and G.L. Cash, entitled Coding of Two Level Pictures by Pattern Matching and Substitution, published in the Bell System Technical Journal, Volume 62, No. 8, October 1983, discloses the contour tracing of step 127 and finding regional features by calculating local boundary curvature extremes of step 130 in Fig. 1. In using the process of the aforementioned Segen article, the processor PR1 finds the coordinate values of the region boundaries (x(i) and y(i)). In step 130, the processor PR1 independently smoothes the coordinate values of the region boundaries with a rectangular filter of length 2k + 1, where k is a parameter for computing the k-curvature of the contour. Determination of the k-curvature of a contour is known and appears in A. Rosenfeld and A. Kak, Digital Picture Processing, Academic Press, 1976, ISBN 0-12-597360-8. It also appears in the 1983 SPIE article from the SPIE Conference on Robot Vision and Sensory Control, in Cambridge, Massachusetts, entitled Locating Randomly Oriented Objects from Partial View, by Jakub Segen. The components of the step 130 appear in Fig. 5 as steps 501 to 524. They start with the smoothing step 501, the slope determining step 504, and the computing step 507.

According to another embodiment of the invention, the curvature is determined otherwise. However, k-curvature is simple and sufficient for the purposes of this invention.

In Fig. 5, step 507 calculates the orientation O(i), or k-slope, as follows:

$$d_x = x'(i+k) - x'(i-k) \qquad (2)$$
$$d_y = y'(i+k) - y'(i-k) \qquad (3)$$
$$O(i) = \operatorname{Atan2}(d_y, d_x) \qquad (4)$$

where the primes denote the smoothed version of the contour, and the Atan2 function computes the arc tangent of $d_y/d_x$ to the range of $-\pi$ to $\pi$ as in rectangular to polar coordinate conversion. The curvature is computed as

$$C(i) = O\!\left(i + \tfrac{k}{2}\right) - O\!\left(i - \tfrac{k}{2}\right) \qquad (5)$$
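The k-slope and the curvature can be sketched in Python as follows. The sketch assumes a closed contour whose coordinates have already been smoothed, indices that wrap around the contour, an even k so that k/2 is an integer, and a wrap of the angle difference into the range from -π to π; the wrap and all names are additions for the illustration and are not stated in the text.

```python
import math


def k_slope(xs, ys, k):
    """Orientation O(i) from points k steps ahead of and behind each contour point."""
    n = len(xs)
    return [math.atan2(ys[(i + k) % n] - ys[(i - k) % n],
                       xs[(i + k) % n] - xs[(i - k) % n])
            for i in range(n)]


def wrap_angle(a):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi


def k_curvature(orientation, k):
    """Curvature C(i) as the orientation change between i + k/2 and i - k/2."""
    n = len(orientation)
    h = k // 2
    return [wrap_angle(orientation[(i + h) % n] - orientation[(i - h) % n])
            for i in range(n)]


# A circle sampled at 16 points has the same curvature value everywhere.
n_points, k = 16, 2
xs = [math.cos(2 * math.pi * i / n_points) for i in range(n_points)]
ys = [math.sin(2 * math.pi * i / n_points) for i in range(n_points)]
curvature = k_curvature(k_slope(xs, ys, k), k)
```

On this test contour every curvature value equals π/4, which is the orientation change over two sample steps. The sign obtained for a real image contour depends on the tracing direction and on the image coordinate convention.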

In step 510, the processor PR1 smoothes the curvature of the image; determines the derivatives of the curvature in step 514; locates the significant zero crossings of the derivative of the curvature in step 517; and determines the normal to the curve at each significant zero crossing in step 520 using two points on the contour, each separated by the Euclidean distance k from the point of the significant zero crossing. In step 524, it stores parameters for each of these "feature" points (i.e. zero crossings of the derivative), namely the x-y location, curvature, and the angle of the normal to the curve. The curvature is positive for convex features and negative for concave features. The processor PR1 stores the feature points in the order that they appear in the contour, which the processor traces clockwise.

Fig. 6 illustrates the results of contour tracing in step 127 and the calculation of local boundary curvature extremes set forth in step 130 and in steps 501 to 524. Fig. 6 includes the normals NO as well as the traced contours CN.

At this point in the processing, the processor PR1 has reduced the data from regions to contours to feature points. The processor PR1 now proceeds to locate features corresponding to head and neck shapes from the set of feature points. For this purpose, the processor PR1 uses a simple, hard coded (not learned) model of the shape of the head and neck in the model input step 134. A representative RE of the model appears with step 134.

In step 137 the processor PR1 matches regional features with the stored model. In this step, the processor PR1 looks for a sequence of feature points that indicate concavity at the left side of the neck, convexity at the top of the head, followed by concavity at the right side of the neck. It examines only the sign of the curvature, not the magnitude. Because the top of the head is roughly circular, the position of the local maximum of curvature is highly sensitive to noise or background segmentation errors. In fact there may be more than one feature point present. Therefore, the processor PR1 searches for one or more convex feature points at the top of the head without restriction on their location. It limits the acceptable direction of the normal to the contour at the neck points to ensure that the detected head is roughly pointing up. It accepts only the normal to the object at the left neck point in the range of 90 to 225 degrees, and the right neck point from -45 to 90 degrees. This restricts overall head tilt to about ± 30 degrees from the vertical. Figs. 7, 8, 9, and 10 show objects OB in images which represent examples of correct detection from contours obtained from dilated, binary motion images. Straight lines SL connect locations of the local maxima of curvatures MA at the neck. The matching step 137 does not require the presence of feature points corresponding to the shoulders.
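The matching rule of step 137 can be illustrated with a short Python sketch. It assumes that each feature point is given as a pair of curvature and normal angle in degrees, that the angles are already expressed in the interval implied by the stated ranges (roughly -90 to 270), and that the feature list is scanned without wrapping around the end of the contour. The function names and the sample data are hypothetical.

```python
def in_range(angle_deg, low, high):
    """True if the angle lies in the closed interval [low, high]."""
    return low <= angle_deg <= high


def find_head_candidates(features):
    """Return (left_neck_index, right_neck_index) pairs matching the head model.

    features: list of (curvature, normal_angle_deg) tuples in contour order.
    Only the sign of the curvature is examined, never its magnitude.
    """
    candidates = []
    n = len(features)
    for start in range(n):
        curv, normal = features[start]
        if not (curv < 0 and in_range(normal, 90, 225)):   # concave left neck point
            continue
        end = start + 1
        while end < n and features[end][0] > 0:            # one or more convex points
            end += 1
        if end == start + 1 or end >= n:
            continue                                       # no convex point or no end
        curv_r, normal_r = features[end]
        if curv_r < 0 and in_range(normal_r, -45, 90):     # concave right neck point
            candidates.append((start, end))
    return candidates


upright = [(-0.5, 180), (0.3, 80), (0.2, 100), (-0.4, 0), (0.6, 10)]
tilted = [(-0.5, 250), (0.3, 80), (-0.4, 0)]
```

With these sample lists, `find_head_candidates(upright)` reports the pair of feature points 0 and 3, while `tilted` yields no candidate because the left normal lies outside the accepted range.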

In step 140 the processor PR1 identifies a possible head shape; it calculates the neck width from the positions of the neck feature points. It compares the neck width to a gross size and determines that the left neck

TnTs^T VT ,eftof ^ ri9ht neck poinL 11 a,so ^^^p-»-*ofih.po«i l SLS2S 

metric on ... whose perimeters exceed a given perimeter threshold. This minimum perimeter restriction results in skipping remaining isolated noise regions. The processor repeats steps 137 and 140 for each region which is a possible head.

Figs. 11 to 21 show other difference images DA and Figs. 11A to 21A show corresponding contours with feature points and normals NO at feature points FP. Each figure number followed by the letter A represents the contour of the figure with the corresponding figure number. The process up to step 144 detects the head

.HAt iT ^ ° ,her ShaPBS "* 38 inV6rted T - ShaP6S SH " ^ examSrslin !£ 

.«r n ° rt °K te ^ ^: ,he processor PR1 used on| y the feature points to match shapes. In step 144 the proc- 
essor goes back to the contour itself. In step 144. the processor finds a possible head's center by compu 

Tol^Tsl hh SeSment ° f C ° ntoUr thattraverses the P°^b.e "-d and is terminated by posXe ne* 
be^,; t 9 !• """"f* the neck P° ints enters the centroid calculation. The radius of the head tkTn 
becomes the mean distance from the calculated center of the head to the contour. Details of step 144 aocear 

cues' th", cT ? f° 7 " 22 ' Here ' SteP 2201 ,he processor PR1 c <™*«s « ^ e kVo^ts Tm- 

StecTSZZ T ,eaSl 1 b,8 ( hea , d in step 2204 ' and determines the likely head radius in step 2207. 
circuX £I£ J '? nty K° f 3 POSSib ' e h6ad by aSSi9ni "9 a conf idence level '» each ^'ection. The 

t^ventio^ H^tlZ- ! 3PeS °, n toP ° f ne< *- ,ike StrUCtUreS - Acco " ,in 9 to anotner embodiment of 
tne invention, other feature points are used as a confidence metric 

th« £J?T in f i9S \ 11At ° 21Al ,hB processorPR1 detects ^e confidence level (step 147) by determining 
emiXS P ° irtS ,hat " e approximate * en annuius whose radius extends from the £ 

ad ^ o™ ™ T ^ e " SOrth ° f radiUS to ,he POSSib,e head radius *™ one - sixt " °f «"e possible head 

the D o,^.£ h h ! S P,aCi " 9 a " annU ' US With a thickness to one - th ^ °' ^ head radius on 

he possible head and seeing how much of the contour the annuius covers. Heads are actually more elliptic^ 

°' 1116 a " nUlUS iS SUff ident 10 «"P~» *r the head-stccenS 

threshold o^nZT^ ^ P T?, SSOt PR1 S6,eCtS hBad images only if are over a confidence image 
threshold percentage. This threshold is a selected default value which a user can override. A typical default value is 40%. According to an embodiment of the invention, suitable means allow the user to change the threshold.

' onlv'if TZ l^H e r roC8S f ^ PR1 d ! Ve ' 0pS a " cumulative confidence level and selects possible head shapes 
mia^ * cumulate confidence threshold: According to an embodiment of the invention, suitable 

m"e the le heaoVrna'c 9S thresho,d - '" step 'he processorPRI determines whether 

it For^xlnt ? T . CUmulative confidence threshold. If so. it selects one head image by 

default. For example, the default head image may be the center one. According to an embodiment of the in- 

mZ : e oTt r r ra,eS ^ inPUt '° ChanQe the d9 ' aul «~ le . ^«am P ie to select the fastest moving hea" 
app.0 heseZe hL^T, T h !f identified 8 P8rSOn by the head ima 9 e and uses that information to 
fh^mln- frth f h . T 9 f thS V,de ° ima9e 30 aS 10 image and focusthe video camera onto 

the image with the control as shown in step 160 

a list^oo'^hld 2' 23 i " USt '!! eS deta ' IS ° f St6P 154> Her8 ' in S,ep 2301 ' ,he process °' PR 1 stains 
a list of possible head shapes exceeding a predetermined threshold confidence level. In step 2304 it maintains 

a h,story of detections of each possible head with position, sfee, confidence level, time stamp andl™ a 

Zl^Z^nTo T""* t * PrBViOUS ""^ ^ * C ° mbin6S 3 a " 
1™!^ ,h , 9 *° Cumulative ~"fWence level and then combines newly detected confidence 

SSlZ!* t^^: 1 '^ teVeL A reCUfSiVe ' OW P3SS ,ilter in the prOCBS - r PR1 sm °o'he S the 
PrT^^J, ? t T h *Z " eW P ° SSib,e h6ad imageS enter the camera ' s field of *ew. the processor 
PR1 adds them to a list of possible heads and places them in position for selection as a default. 

pr.™ 7 ' ,f Io f ,ion °f 3 new "election is close to one of the previous possible heads, the processor 

Ltec^s JZZt t T t0f * he ear " er POSSib ' e h6ad - The processor PR1 the " adds the each new 

^ thB cumulative co"fdence level, and its size modifies the filtered size. In the 

an^ddinn Tn.; f t T*™? ™" CalCUlateS ,he ,i,tered sfee «V tekino M % °f the old filtered size 
mX vl d l1„ stl 2310 T* deteC * ed 0ther tyP6S ° f ' OW - paSS f ilters can be - a d and the percentages 
fiZZZL^f f l ; ! processor determines if the new detections is close to two objects and attributes 
™ T 3 teC " 0n - Thismakes possible correctlyto tracka person who passes a stationery 

person. The moving person represents the more recently detected one than the stationery one 
oe J?n S l?~~ ™ 6 prOCesaor PR decrements the cumulative confidence level each time a head fails to ap- 

exceeds iThTsh^ PR ^ 33 3 Valid nead «« « he confidence . 

exceeds a threshold. This assures confident detection several times.

In step 2317, the processor PR determines that no person, i.e. no head, appears in a frame. If it detects little motion, as indicated by small difference-image regions, the processor does not update the background image. This conditional background subtraction corresponds to increasing the τ parameter in equation (1). This effectively decreases the temporal sampling rate and effectively increases the speed of the objects.

According to another embodiment of the invention, the processor PR utilizes static background subtraction by repeatedly subtracting a background frame acquired at some time $t = t_0$ from the sequence. In equation (1), $f(x,y,t-\tau)$ would become $f(x,y,t_0)$. Such static background subtraction offers the advantage that the difference signal is the same regardless of whether the person is moving or at rest. This contrasts with frame subtraction, i.e. dynamic background subtraction, in which the signal goes to 0 if the person stops moving. The object velocity does not affect the detection rate with static background subtraction.

According to another embodiment of the invention, the processor PR subtracts off a temporally low pass 
filtered version of the sequence instead of subtracting or comparing previous frames. That is, it compares the 
input image to the low-pass version. 

According to an embodiment of the invention, the processor PR1 utilizes image processing hardware such as a Datacube MaxVideo20 image processing system, a general purpose computer such as a SKYbolt i860 single board computer, and a Sun Sparc engine 1e. The Sun Sparc engine acts mainly as a system controller. The processor PR1 hardware includes the processing units and other peripherals to form the means for performing the steps in each of the figures, other than those performed outside the processor. The particular hardware disclosed is only an example, and those skilled in the art will recognize that other processing equipment can be used.

The camera CA uses a 4.8 mm c-mount lens, a 2/3 inch CCD in a Sony XC-77 camera. The processor PR1 digitizes the image to 512 by 480 pixels which have a 4:3 aspect ratio. This yields an active detection area from 1 foot to 10 feet from the camera with an 80 degree horizontal and 60 degree vertical field of view. According to an embodiment of the invention, timing of the digitizer is changed to produce square pixels. It is possible to get a full 80 degrees with square pixels by digitizing a 682 by 480 pixel image.

According to one embodiment of the invention, the setting of T in equation (1) is 1 3. According to another 
embodiment of the invention T=8 in order to lose less of the signal. In the processor PR1 , background removal 
and dilation take place on the MaxVideo20 image processing system. This is a pipeline system in which low- 
level, full frame operations take place in real time. The MaxVideo20 image processing system's 256 by 256 
lookup table and double buffers serve for background removal. The SKYbolt computer performs the remaining 
processing. 
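
A lookup table of this size can perform background removal in a single pass, since every pair of eight-bit pixel values indexes one entry. The sketch below assumes, in line with claim 2, that background removal compares the absolute difference of two pixel values with the threshold T. The strict inequality and the names `build_difference_lut` and `remove_background` are assumptions for illustration only.

```python
def build_difference_lut(threshold):
    """256 x 256 table: lut[a][b] is 1 when |a - b| exceeds the threshold."""
    return [[1 if abs(a - b) > threshold else 0 for b in range(256)]
            for a in range(256)]


def remove_background(current, previous, lut):
    """Look up each (current, previous) pixel pair instead of computing it."""
    return [[lut[c][p] for c, p in zip(crow, prow)]
            for crow, prow in zip(current, previous)]


# Usage with T = 8: only the pixel that changed by more than 8 survives.
lut = build_difference_lut(threshold=8)
previous = [[100, 100, 100]]
current = [[100, 105, 120]]
mask = remove_background(current, previous, lut)
```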

According to an embodiment of the invention the processor PR1 uses a convolver to dilate. Specifically
it uses the MaxVideo20 image processing system's 8x8 convolver to perform the dilation operation. Dilation
with this large kernel provides increased region growing performance. Convolution of the image f(x,y) with an
8x8 kernel h(i,j) is

$$g(x,y) = \sum_{i}\sum_{j} f(x+i,\,y+j)\,h(i,j) \qquad (6)$$

If a binary image f(x,y), with values zero and one, is convolved with an 8x8 kernel of all ones (h(i,j) = 1),
the resulting image g(x,y) will have values from zero to 64. This is normally thought of as a low pass filtered
image, but in this case the grey scale values can be interpreted differently. These values indicate the number
of non-zero pixels in the 8x8 neighborhood surrounding each pixel.

Dilation involves the concept of passing a structuring element (kernel) over an image and setting a one
in each pixel at which there is a non-empty set intersection between the image and the structuring element.
Intersection is defined as the logical "and" operation for each member of the structuring element and the
corresponding image data. Setting all values greater than zero to one in the convolved image g(x,y) produces
the same result as dilating the original image with an 8x8 structuring element. If

$$g'(x,y) = \begin{cases} 1 & \text{if } \sum_{i}\sum_{j} f(x+i,\,y+j) > 0 \\ 0 & \text{otherwise} \end{cases} \qquad (7)$$

then g' is the dilation of f with structuring element H (with h(i,j) = 1),

$$g' = f \oplus H = \{(x,y) : H_{(x,y)} \cap f \neq \emptyset\} \qquad (8)$$

where H_{(x,y)} is H translated to the point (x,y). This is disclosed in the publication by A.K. Jain, Fundamentals of
Digital Image Processing, Prentice-Hall, Inc., 1989, ISBN 0-13-336165-9.

According to an embodiment of the invention, rather than using all the non-zero values in the resulting
image, the processor PR1 thresholds it to remove isolated noise pixels. According to another embodiment of
the invention, in a first step, the processor PR1 places the condition that there must be at least n (for example,
[illegible]) pixels in an 8x8 region for the center pixel to be considered part of an object. The next noise removal step
occurs during extraction of the contour, where the boundary tracing routine rejects small (noise) regions.
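
The relation between convolution with a kernel of ones, dilation, and the count threshold can be checked with a short sketch. It is illustrative only: the kernel offsets are assumed to run from 0 to k-1, pixels outside the image are treated as zero, and the function names and the `min_count` parameter are hypothetical. A small k keeps the example readable; the text uses k = 8.

```python
def convolve_ones(f, k=8):
    """g(x, y): number of non-zero pixels of binary f under a k x k kernel of ones."""
    h, w = len(f), len(f[0])
    return [[sum(f[y + j][x + i]
                 for j in range(k) for i in range(k)
                 if y + j < h and x + i < w)
             for x in range(w)] for y in range(h)]


def threshold_counts(g, min_count=1):
    """Keep a pixel only when at least min_count neighbours were set."""
    return [[1 if v >= min_count else 0 for v in row] for row in g]


def dilate_by_definition(f, k=8):
    """Set a pixel when the translated k x k structuring element intersects f."""
    h, w = len(f), len(f[0])
    return [[1 if any(f[y + j][x + i]
                      for j in range(k) for i in range(k)
                      if y + j < h and x + i < w) else 0
             for x in range(w)] for y in range(h)]


# Usage: one isolated pixel in a 5 x 5 image, 3 x 3 kernel.
f = [[0] * 5 for _ in range(5)]
f[2][2] = 1
g = convolve_ones(f, k=3)
dilated = threshold_counts(g, min_count=1)   # same as dilation
denoised = threshold_counts(g, min_count=2)  # isolated pixel is rejected
```

With `min_count=1` the thresholded convolution equals the dilation. Raising `min_count` corresponds to requiring at least n pixels in the neighborhood.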

According to an embodiment of the invention, in contrast to the full frame processing described above,
the processing takes place on a general purpose microprocessor (an Intel i860 in a SKYbolt single board
computer) running C code. The input to this section is the output of the full frame processing section: a
two dimensional, eight bit array which is the dilated frame difference image.

The contour extraction generates a list of X-Y coordinate pairs that correspond to the boundaries or closed
contours of regions in the difference image. An embodiment of the invention, in order to speed processing,
employs some short cuts on the standard contour following algorithm disclosed in the aforementioned Jain
publication. First, the image is sparsely sub-sampled vertically while searching for objects (only every 20th line is
examined). This causes skipping of some small objects, so it essentially imposes a minimum height requirement
for the objects. Second, tracing of the contour involves subsampling the image data two to one in both
directions. Only even numbered pixels on even numbered rows are examined. Third, no attempt is made to find
internal contours (e.g. the center of a doughnut shape would not be found).

Another embodiment finds multiple objects, i.e. possible heads, while avoiding tracing the same object
more than once. Before searching the image for blobs, the entire search pattern (every other pixel on
every 20th line of the dilated difference image) is thresholded. If the pixel is non-zero, it is set to one. It is not
necessary to threshold the whole image, only the places being searched. The system begins to scan the image
to find blobs (as indicated by non-zero values). When a blob is found, its boundary is traced and stored to be
used later. Then it tags the blob as having been traced by writing a tag level (for example, the value 2) into
the image along the blob boundary. Processing actually modifies the image data as processing occurs. This
leaves three possible types of pixel values in the search path: zero, which indicates no object; one, which
indicates a new object to be traced; and two, which indicates an object that has already been traced. As the
scan proceeds, the following algorithm is used:

• If the pixel value is zero, skip to the next even numbered pixel (i.e. continue searching).

• If the pixel value is one, trace the contour of the object and tag the blob.

• If the pixel value is two, keep following the line until another two is found (which indicates the right hand
edge of the blob).
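
The scan with the three pixel values can be sketched as below. The sketch makes several simplifying assumptions that are not in the text: the image is taken to be a binary grid that has already been subsampled two to one, a flood fill that collects boundary pixels stands in for the contour following routine (so the boundary is an unordered list), and on meeting a traced blob the scan skips to the end of the non-zero run instead of literally waiting for the next tagged pixel. All function names are hypothetical.

```python
from collections import deque


def trace_and_tag(img, seed_y, seed_x, tag=2):
    """Collect the boundary of the 8-connected blob at the seed and tag it."""
    h, w = len(img), len(img[0])
    seen = {(seed_y, seed_x)}
    queue = deque([(seed_y, seed_x)])
    boundary = []
    while queue:
        y, x = queue.popleft()
        # A blob pixel is on the boundary if a 4-neighbour is background
        # or lies outside the image.
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if not (0 <= ny < h and 0 <= nx < w) or img[ny][nx] == 0:
                boundary.append((y, x))
                break
        # Grow the blob through all eight neighbours.
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w and img[ny][nx] != 0
                        and (ny, nx) not in seen):
                    seen.add((ny, nx))
                    queue.append((ny, nx))
    for y, x in boundary:
        img[y][x] = tag  # the image itself records that this blob was traced
    return boundary


def find_blobs(img, row_step=20):
    """Scan every row_step-th line; trace each blob once, skip traced ones."""
    contours = []
    for y in range(0, len(img), row_step):
        x, width = 0, len(img[y])
        while x < width:
            if img[y][x] == 0:      # zero: no object, keep searching
                x += 1
                continue
            if img[y][x] == 1:      # one: new object, trace and tag it
                contours.append(trace_and_tag(img, y, x))
            while x < width and img[y][x] != 0:
                x += 1              # two: skip to the right hand edge
    return contours
```

Because the first pixel of every non-zero run on a scan line is a boundary pixel, a blob that was already traced always presents a two at its left edge, so it is skipped rather than traced again.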

The block diagram of Fig. 24 illustrates a teleconferencing and televideo system embodying the invention.
Here, the video camera VC1 is at one televideo station TS1 and passes detected video and audio signals to
the processor PR1. The display DI1 displays the video output and plays the audio output from the video camera
VC1 on the basis of the processing by the processor PR1. The video camera VC1 records a scene which
includes at least one person. The processor PR1 also emits control signals to the video camera VC1 to center
the video camera on the person or to cause the display to center the person electronically.

A transmission line TR1 transmits the video signal that the processor PR1 develops for the display DI1
as well as the audio signal to a processor PR2, corresponding to the processor PR1, at a second televideo
station TS2. The processor PR2 also produces control signals which can, upon command, control a video
camera VC2 to center on a person. The processor PR2 also processes signals from the video camera
VC2 and displays those signals in a display DI2 and, upon command, can center a person in the display
DI2. The processors PR1 and PR2 have respective manual inputs MI1 and MI2 which cause displays DI1 and
DI2 each selectively to display the processed input from the video cameras VC1 or VC2 or both. A viewer at
the end of the video camera VC1 can thereby choose to display the processed images from the video
camera VC1 or the processed images from the video camera VC2, or both. Similarly, the viewer at the end of
the video camera VC2 can choose to display, on the display DI2, either the processed video images from the
video camera VC2 or the processed video images at the video camera VC1, or both. The selection of displays
is entirely within the discretion of the viewers, and hence the transmitters, at either end of the transmission
line TR1. Typically, a viewer of the display DI1 would wish to see the processed images from the camera VC2
and the viewer at the display DI2 would wish to see mostly the processed images from the video camera
VC1. Each viewer would be expected to switch the viewing scene only temporarily from time to time to the
local scene rather than the remote scene.

The processors PR1 and PR2 also permit each viewer to select the unprocessed views from each of the
cameras VC1 and VC2.

The system of Fig. 24 allows automatic panning. It permits participants at one site to control a camera at 
a distant site and permits panning and zooming to get a better image of the portion of the scene that is of interest 
to the viewer. 

Fig. 25 shows a picture-in-picture for tele-education and tele-lecturing. The instructor's face FA1 appears
in a window WI1 that is superimposed on an image of notes NT1 being presented. In the systems of Figs. 1
and 25, the processors PR1 and PR2 include means for selecting the instructor's picture and superimposing
it in the position shown.

The invention helps the acceptance of video telephony because the user need not remain positioned
directly in front of the terminal to be in the camera's field of view. In video telephony and teleconferencing, the
automatic camera panning frees the user to concentrate more on personal interaction and less on such
technical issues as camera viewing angles. It eliminates the need for a camera operator in tele-education. It reduces
the cost and the complexity of tele-education.

According to an embodiment of the invention, the orientation of a person's head acts as a source of
computer input to control a cursor. Alternatively, the person detection serves as a pre-processing step for gaze
tracking. In turn, gaze tracking serves as a human-machine interface.

According to yet another embodiment of the invention, the system of Figs. 1 and 25 operates as a video motion
detection system in television surveillance. The system automatically switches the input to the operator's
monitors, which view only scenes with motion in them. The system discriminates between people in the images
and other moving objects. It raises an alarm (not shown) upon detection of a person.

The system of Fig. 1, in an embodiment of the invention, has the processor PR1 store images on the VCR 
so that it responds only to images with people. This reduces data storage requirements by extracting sub-im- 
ages containing persons. 

The apparatuses of Figs. 1 and 25, in another embodiment of the invention, record traffic patterns of
patrons in retail stores. This permits evaluation of the effectiveness of a new display or arrangement of
merchandise by examining the change in traffic. An eye-catching arrangement would result in increased dwell time
of passers-by.

The invention improves the potential for image compression by incorporating knowledge of locations of
persons in the image. For example, a first step involves feeding a sub-window at full camera resolution to the
image coder instead of subsampling an entire image. The invention permits person detection to select
the sub-window of interest. This essentially uses electronic camera panning as a compression aid.
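
Selecting the sub-window can be sketched as a clamped crop around the detected head position. This is an illustration under assumptions: the head position is given as a pixel coordinate, the window is no larger than the image, and the names `select_subwindow` and `crop` are hypothetical.

```python
def select_subwindow(image_w, image_h, head_cx, head_cy, win_w, win_h):
    """Return (left, top, right, bottom) of a window centred on the head.

    The window is shifted where necessary so that it stays inside the image.
    """
    left = min(max(head_cx - win_w // 2, 0), image_w - win_w)
    top = min(max(head_cy - win_h // 2, 0), image_h - win_h)
    return left, top, left + win_w, top + win_h


def crop(image, box):
    """Cut the sub-window out of a row-major image at full resolution."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


# Usage: a head near the right edge of a 512 by 480 image.
box = select_subwindow(512, 480, head_cx=500, head_cy=240, win_w=128, win_h=128)
```

Only the cropped pixels would then be passed to the coder, which is the sense in which electronic panning acts as a compression aid.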

The invention makes it possible to detect a person almost anywhere in a scene with a single camera. It
can operate in normal office environments without requiring special lighting or imposing background scene
restrictions. It permits real time operation. It avoids special startup procedures such as acquiring a background
image with no persons present. It furnishes robustness in the face of camera position changes or scene
changes such as lighting changes.

The invention may be used as a pre-processing step in the type of face recognition described by N. Farahati,
A. Green, N. Piercy, and L. Robinson in Real-Time Recognition Using Novel Infrared Illumination, in Optical
Engineering, August 1992, Vol. 31, No. 8, pp. 1658-1662.

According to another embodiment of the invention, the video cameras VC1 and VC2 record not only video
but audio signals needed for teleconferencing and other purposes. The displays DI1 and DI2 as well as the
recorder RE1 include audio equipment for audio output and recording as needed.

In the processors PR1 and PR2, according to the invention, each step performed by the computer
components, such as determining, producing a signal, etc., generates a physical signal in the form of a voltage or
current. The processors PR1 and PR2 hardware includes the processing units and other peripherals to use
these signals and form the means for performing the processor steps in each of the figures.

In an embodiment of the invention, either or both of the cameras CA1 and CA2 utilize a wide-angle lens
in the process of identifying the region of a head. After reaching a satisfactory cumulative confidence level,
either or both of the processors PR1 and PR2 zoom in on the head by electronic panning, tilting, and zooming
in a known manner. The reproduction of the zoomed head now increases and takes up a much larger portion,
and, if desired, virtually all of the screen in the appropriate display DI1 or DI2. The image follows the
now-enlarged head as the person moves from side to side, sits down, rises, or walks away.

In still another embodiment of the invention the same signals that control the pan and tilt of the video signal
serve to focus the sound pattern of a microphone on the camera on the head of the person.

While embodiments of the invention have been described in detail it will be evident to those skilled in the 
art that the invention may be embodied otherwise without departing from its spirit and scope. 



Claims 

1. The method of locating a person in a video picture, comprising: 

forming a differential image from video images to extract differential figures; 
comparing local boundary curvature extremes of the differential figures with a stored model of a 
human head; and 

identifying a region corresponding to the model of the human head from the comparison of the local 
boundary curvature extremes of the differential figures with the stored model of the human head. 

2. The method as in claim 1, wherein the forming step includes digitizing the images to form two dimensional
arrays of digital data and subtracting an image from a previous image; and further includes forming a 
threshold and taking the absolute values of the digital image data and comparing them with the threshold. 

3. The method as in claim 1 or 2, wherein the step of comparing includes fitting a surrounding shape to the 
portion of the region boundary corresponding to the head. 

4. The method as in claim 3, wherein the surrounding shape is an annulus. 

5. The method as in any one of claims 1 to 4, further comprising sensing data from a subregion of an input 
image corresponding to the shape of the human head for separate operation. 

6. The method as in any one of claims 1 to 5, further comprising sensing data from a subregion of an input 
image throughout the region corresponding to the human head to transmit a human head and controlling 
a mechanical system for pointing a camera to keep the head within the image. 

7. The method as in claim 5 or 6, wherein the step of sensing includes allocating the greater portion of trans- 
mission bandwidth to the subregion that contains the head. 

8. The method as in claim 5, 6, or 7, wherein the step of sensing includes selecting one of several cameras
in a system on the basis of the sensing so as to display the camera with the person in its field of view. 

9. The method as in claim 5, 6, 7, or 8, wherein the step of sensing includes storing statistical data about 
the motion and the presence of people in a scene. 

10. The method as in any one of claims 1 to 9 wherein the step of identifying includes placing the image of 
the head at a predetermined position in the video picture. 

11. The method as in any one of claims 1 to 10, wherein the step of forming includes subtracting images
separated from each other by a time τ and varying the time τ to adjust the figures.

12. An apparatus for locating a person in a video picture, comprising: 

means for forming a differential image from video images to extract differential figures; 
means for comparing local boundary curvature extremes of the differential figures with a stored model 
of a human head; and 

means for identifying a region boundary corresponding to the model of the human head from the 
comparison of the local curvature extremes of the differential figures with the stored model of the human 
head. 

13. An apparatus as in claim 12, wherein the forming means includes means for digitizing the images to form 
two dimensional arrays of digital data and subtracting an image from a previous image; and further in- 
cludes means for forming a threshold and taking the absolute values of the digital image data and com- 
paring them with a threshold. 

14. An apparatus as in claim 11 or 12, wherein the means for comparing includes means for fitting a
surrounding shape to the portion of the region boundary corresponding to the head.

15. An apparatus as in claim 14, wherein the surrounding shape is an annulus.

16. An apparatus as in any one of claims 12 to 15, further comprising means for detecting data from a
subregion of a differential image corresponding to the shape of the human head for separate operation.

17. An apparatus as in any one of claims 12 to 16, further comprising means for sensing data from a subregion
of an input image throughout the region corresponding to the human head to transmit a human head and 
further including means for controlling a mechanical system for pointing a camera to keep the head within 
the image. 

18. An apparatus as in claim 16 or 17, wherein the means for sensing includes means for allocating the greater
portion of transmission bandwidth to the subregion that contains the head.

19. An apparatus as in claim 16, 17, or 18, wherein the means for sensing includes means for selecting one
of several cameras in a system on the basis of the sensing so as to display the camera with the person
in its field of view.

20. An apparatus as in claim 16, 17, 18, or 19, wherein the means for sensing includes means for storing
istical data about the motion and the presence of people in a scene. 

21. An apparatus as in any one of claims 12 to 20, wherein the means for identifying includes means for plac-
ing the image of the head at a predetermined position in the video picture. 

22. An apparatus as in any one of claims 12 to 21, wherein the means for forming includes means for subtracting images
separated from each other by a time τ and means for varying the time τ to adjust the figures.